A compartmental model for smoking dynamics in Italy: a pipeline for inference, validation, and forecasting under hypothetical scenarios

We propose a compartmental model for investigating smoking dynamics in an Italian region (Tuscany). Calibrating the model on local data from 1993 to 2019, we estimate the probabilities of starting and quitting smoking and the probability of smoking relapse. Then, we forecast the evolution of smoking prevalence until 2043 and assess the impact on mortality in terms of attributable deaths. We introduce elements of novelty with respect to previous studies in this field, including a formal definition of the equations governing the model dynamics and a flexible modelling of smoking probabilities based on cubic regression splines. We estimate model parameters by defining a two-step procedure and quantify the sampling variability via a parametric bootstrap. We propose the implementation of cross-validation on a rolling basis and variance-based Global Sensitivity Analysis to check the robustness of the results and support our findings. Our results suggest a decrease in smoking prevalence among males and stability among females over the next two decades. We estimate that, in 2023, 18% of deaths among males and 8% among females are due to smoking. We test the use of the model in assessing the impact on smoking prevalence and mortality of different tobacco control policies, including the tobacco-free generation ban recently introduced in New Zealand.

Supplementary Information: The online version contains supplementary material available at 10.1186/s12874-024-02271-w.

The assumptions underlying the SHC model are summarized below:
• the probability of starting smoking γ_i(a) depends on age and smoking intensity;
• people can start smoking between the ages of 14 and 34. Note that this assumption relaxes the one in Levy et al. [1], where the maximum initiation age was set to 24;
• the probabilities of starting smoking γ_i(a) depend on age through γ(a), while the distribution of the level of smoking intensity π is assumed to be constant over time and age;
• the distribution of the new smokers by smoking intensity does not depend on age and calendar time;
• smokers do not change their smoking intensity during their entire life (this also implies that if an ex-smoker goes back to smoking, her/his smoking intensity is the same as when she/he first started smoking). This assumption is in line with evidence that smoking patterns tend to stabilize during the first years after smoking initiation [2]. It is also supported by evidence that the number of "reducers" and "increasers" is similar, leading to an overall balance of the transitions [3];
• the probability of stopping smoking ε(a) depends only on age;
• people can quit smoking only after 20 years of age. Note that this assumption relaxes the one in Levy et al. [1], where smoking cessation is not allowed before age 25;
• the probability of relapsing η(c) changes with time since smoking cessation, but does not depend on age and smoking intensity;
• after 15 years since smoking cessation, the probability of smoking relapse becomes constant;
• the rates of quitting depend on age but do not depend on the level of smoking intensity;
• an ex-smoker who first relapses, then stops smoking again, becomes a 0-year former smoker;
• the population is closed to immigration and emigration (but we considered new births and deaths);
• the risk of death depends on age for never smokers (δ_N(a)), on both smoking intensity and age for current smokers (δ_{C_i}(a)), and on time since smoking cessation and age for former smokers (δ_F(a, c)). For simplicity, we do not consider the level of smoking intensity in the definition of the mortality for former smokers;
• the mortality rate of current smokers does not depend on the time since starting smoking;
• the mortality rate of former smokers depends on the time since smoking cessation and on age;
• for each individual, only one event among starting, quitting, relapsing, being born or dying occurs in the year, and we assume that it happens at the end of the year;
• at each time, the probabilities of starting and quitting smoking and the probability of smoking relapse are defined among those who do not die during the year (they are conditional probabilities);
• none of the transition rates changes with calendar time t.
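The assumptions above can be illustrated with a minimal sketch of a one-year update for a single age class. This is not the authors' implementation: the function name, the argument layout, the pooling of all longer times since cessation into the last bin, and every numerical value are assumptions made for illustration only. Deaths are removed first, and the probabilities of starting, quitting, and relapsing are then applied to survivors, in line with their definition as conditional probabilities.

```python
from typing import List, Tuple


def step_one_year(
    never: float,
    current: List[float],        # current smokers by intensity level i
    former: List[List[float]],   # former smokers by intensity i and years since cessation c
    gamma: float,                # probability of starting smoking at this age
    pi: List[float],             # distribution of new smokers by intensity (sums to 1)
    eps: float,                  # probability of quitting at this age
    eta: List[float],            # probability of relapse by years since cessation
    delta_n: float,              # death probability, never smokers
    delta_c: List[float],        # death probability, current smokers by intensity
    delta_f: List[float],        # death probability, former smokers by years since cessation
) -> Tuple[float, List[float], List[List[float]]]:
    """Advance one age class by one year (illustrative simplification)."""
    n_levels = len(current)
    c_max = len(eta) - 1  # assumption: the last bin pools all longer durations

    # Deaths are removed first; transitions are conditional on survival.
    surv_n = never * (1.0 - delta_n)
    surv_c = [current[i] * (1.0 - delta_c[i]) for i in range(n_levels)]
    surv_f = [
        [former[i][c] * (1.0 - delta_f[c]) for c in range(c_max + 1)]
        for i in range(n_levels)
    ]

    new_never = surv_n * (1.0 - gamma)
    new_current: List[float] = []
    new_former: List[List[float]] = []
    for i in range(n_levels):
        # Relapsers return to the intensity level they had before quitting.
        relapsed = sum(surv_f[i][c] * eta[c] for c in range(c_max + 1))
        new_current.append(
            surv_c[i] * (1.0 - eps) + surv_n * gamma * pi[i] + relapsed
        )
        row = [0.0] * (c_max + 1)
        row[0] = surv_c[i] * eps  # new quitters become 0-year former smokers
        for c in range(c_max + 1):
            # Former smokers who do not relapse move one year forward.
            row[min(c + 1, c_max)] += surv_f[i][c] * (1.0 - eta[c])
        new_former.append(row)
    return new_never, new_current, new_former


# Illustrative values only (not taken from the paper).
state = step_one_year(
    never=1000.0,
    current=[100.0, 50.0, 20.0],
    former=[[10.0, 5.0], [4.0, 2.0], [1.0, 1.0]],
    gamma=0.05, pi=[0.5, 0.3, 0.2], eps=0.04, eta=[0.2, 0.1],
    delta_n=0.01, delta_c=[0.02, 0.03, 0.04], delta_f=[0.02, 0.015],
)
```

Because each individual experiences at most one event per year, the total population after the update equals the number of survivors, which gives a simple consistency check for any implementation of this kind.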

S2 Details on the fixed parameters
The values assigned to the fixed parameters of the SHC model are detailed below: • the vector of proportions π was set to the average proportions of

S3 Additional details on Global Sensitivity Analysis
Table S3.1: Distribution of the input parameters used in the GSA and related data sources.

S4 Population Attributable Fraction computation
The Population Attributable Fraction for the class of age a at time t, PAF(t; a), is calculated as the proportion of deaths that would be avoided if all current and former smokers of age a at time t in the population were never smokers [5]: .
Analogously, the overall PAF at time t is: .
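The verbal definition above can be turned into a short computation. The sketch below is one reading of that definition, not the authors' formula: the attributable deaths are the difference between the expected deaths under the observed smoking distribution and the expected deaths if everyone had the never-smoker mortality δ_N(a), and the PAF is that difference divided by the expected deaths. The function names, the data layout, and the aggregation over age classes used for the overall PAF are assumptions.

```python
from typing import List, Tuple


def deaths_by_age(
    never: float,
    current: List[float],        # C_i(t; a), by intensity level i
    former: List[List[float]],   # F_i(t; a, c), by intensity i and years since cessation c
    delta_n: float,              # delta_N(a)
    delta_c: List[float],        # delta_{C_i}(a)
    delta_f: List[float],        # delta_F(a, c)
) -> Tuple[float, float]:
    """Return (smoking-attributable deaths, total expected deaths) for one age class."""
    expected = never * delta_n
    expected += sum(n * d for n, d in zip(current, delta_c))
    expected += sum(row[c] * delta_f[c] for row in former for c in range(len(delta_f)))

    # Counterfactual: every current and former smoker has never-smoker mortality.
    population = never + sum(current) + sum(sum(row) for row in former)
    counterfactual = population * delta_n
    return expected - counterfactual, expected


def paf_age(*args) -> float:
    """PAF(t; a) for a single age class."""
    attributable, expected = deaths_by_age(*args)
    return attributable / expected


def paf_overall(age_classes: List[tuple]) -> float:
    """Overall PAF at time t, assuming deaths are summed over age classes."""
    pairs = [deaths_by_age(*args) for args in age_classes]
    return sum(p[0] for p in pairs) / sum(p[1] for p in pairs)


# Illustrative values only (not taken from the paper).
one_age = (800.0, [100.0], [[100.0]], 0.01, [0.03], [0.02])
value = paf_age(*one_age)
```

The numerator returned by `deaths_by_age` corresponds to the number of Smoking Attributable Deaths (SAD) for the age class under this reading.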

S5 Additional results
Table S5.1: Estimated prevalence (%) of never, current, and former smokers in the population with 90% confidence intervals, evaluated every 10 years from 1993 to 2043 for males, by period of calibration.

Fig. S5.1: Results of the two-step estimation procedure for males by period of calibration (from 1993 to 2004 in a light colour and from 2005 to 2019 in a dark colour): Estimated Population Attributable Fraction (PAF) and number of Smoking Attributable Deaths (SAD), with 90% confidence bands, for people over 35 years old (a) and over 65 years old (b).

Fig. S5.2: Results of the two-step estimation procedure for females by period of calibration (from 1993 to 2004 in a light colour and from 2005 to 2019 in a dark colour): Estimated Population Attributable Fraction (PAF) and number of Smoking Attributable Deaths (SAD), with 90% confidence bands, for people over 35 years old (a) and over 65 years old (b).

Fig. S5.3: Estimated Population Attributable Fraction (PAF) and number of Smoking Attributable Deaths (SAD) among people over 35 years of age, with 90% confidence bands, for males (a) and females (b) under different tobacco control policies (TCP).

Fig. S5.4: Estimated Population Attributable Fraction (PAF) and number of Smoking Attributable Deaths (SAD) among people over 65 years of age, with 90% confidence bands, for males (a) and females (b) under different tobacco control policies (TCP).
number of never, current and former smokers, by age and sex, was obtained by applying to the resident population in Tuscany on the 1st of January 1993 (t = 0) (http://www.istat.it/) the prevalence of never, current, and former smokers estimated from the 1993 ISTAT AVQ survey (www.istat.it/it/archivio/91926), as well as the smoking intensity distribution from the ISTAT AVQ and EHIS surveys (www.istat.it/it/archivio/167485) for current and former smokers, respectively. For details on population size and the prevalence of never, current, and former smokers in 1993 see Figures S2.1-S2.3. Note that in the sensitivity analysis described in Section 3.4 of the manuscript, we used the same quantities referred to the year 2005 (Figures S2.4 and S2.5);
• in order to quantify C_i(0; a), the initial number of current smokers in 1993, stratified by age, obtained as described at the previous point, has been multiplied by the proportions of low, medium and high-intensity smokers arising from the ISTAT AVQ survey carried out from 1993 to 2019 (Figure S2.1). This procedure has been applied separately for males and females;
• in order to quantify F_i(0; a, c), first we multiplied the initial number of former smokers in 1993, stratified by age, by the proportions of low, medium and high-intensity ex-smokers arising from the ISTAT EHIS surveys (www.istat.it/it/archivio/167485) carried out in 1994, 1999, 2004, and 2013. Then, we used the distribution of former smokers by time from smoking cessation in 1993 to obtain the initial compartment sizes. This procedure has been applied separately for males and females. For details, see Figures S2.6 and S2.7;
• the relative risks for current and former smokers versus never smokers were obtained from the literature. Specifically, we used the rates estimated from the US population within the Cancer Prevention Study II and reported in the Supplementary Appendix of Thun et al. [4], Tables S4, S11, and S12. For details see Tables S2.1 and S2.2.
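The construction of the initial compartments C_i(0; a) and F_i(0; a, c) described above amounts to splitting age-specific counts by fixed proportions. The following is a minimal sketch under stated assumptions: the function name and data layout are hypothetical, the distribution of former smokers by time since cessation is taken to be the same for all ages (the text does not say whether it is age-specific), and all numbers are illustrative rather than taken from the ISTAT surveys.

```python
from typing import Dict, List, Tuple


def initial_compartments(
    current_by_age: Dict[int, float],   # current smokers at t = 0, by age
    former_by_age: Dict[int, float],    # former smokers at t = 0, by age
    p_current: List[float],             # intensity proportions, current smokers (sum to 1)
    p_former: List[float],              # intensity proportions, former smokers (sum to 1)
    cessation_dist: List[float],        # share by years since cessation (sum to 1)
) -> Tuple[Dict[int, List[float]], Dict[int, List[List[float]]]]:
    """Split age-specific counts into C_i(0; a) and F_i(0; a, c)."""
    # C_i(0; a): current smokers of age a times the share of intensity level i.
    c0 = {a: [n * p for p in p_current] for a, n in current_by_age.items()}
    # F_i(0; a, c): former smokers of age a, split by intensity i and then by
    # years since cessation c (assumed independent of age in this sketch).
    f0 = {
        a: [[n * q * d for d in cessation_dist] for q in p_former]
        for a, n in former_by_age.items()
    }
    return c0, f0


# Illustrative values only.
c0, f0 = initial_compartments(
    current_by_age={40: 1200.0, 41: 1100.0},
    former_by_age={40: 600.0, 41: 650.0},
    p_current=[0.3, 0.5, 0.2],
    p_former=[0.4, 0.4, 0.2],
    cessation_dist=[0.1, 0.2, 0.7],
)
```

In the procedure described in the text this split is carried out separately for males and females, so the function would be called once per sex with sex-specific inputs.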

Table S2.1: Relative Risk of current smokers by smoking intensity (RR_{C_i}) for males and females. Source: Thun et al. [4].

Table S2.2: Relative Risk of former smokers by time since smoking cessation (RR_{F_i}(c)) for males and females. Source: Thun et al. [4].

Table S5.8: Expected percentage decrease in the number of Smoking Attributable Deaths (SAD) under different tobacco control policies (TCP1, TCP2, TCP3) with respect to the reference scenario (TCP0), in the years 2023, 2033, and 2043, with 90% confidence intervals, among males and females aged over 35 and over 65.